Generic eukaryotic core promoter prediction using structural features of DNA.

Authors

  • Thomas Abeel
  • Yvan Saeys
  • Eric Bonnet
  • Pierre Rouzé
  • Yves Van de Peer
Abstract

Despite many recent efforts, in silico identification of promoter regions is still in its infancy. However, the accurate identification and delineation of promoter regions is important for several reasons, such as improving genome annotation and devising experiments to study and understand transcriptional regulation. Current methods to identify the core region of promoters require large amounts of high-quality training data and often behave like black box models that output predictions that are difficult to interpret. Here, we present a novel approach for predicting promoters in whole-genome sequences by using large-scale structural properties of DNA. Our technique requires no training, is applicable to many eukaryotic genomes, and performs extremely well in comparison with the best available promoter prediction programs. Moreover, it is fast, simple in design, and has no size constraints, and the results are easily interpretable. We compared our approach with 14 current state-of-the-art implementations using human gene and transcription start site data and analyzed the ENCODE region in more detail. We also validated our method on 12 additional eukaryotic genomes, including vertebrates, invertebrates, plants, fungi, and protists.

Similar resources

Characterization of Eukaryotic Core Promoters Based on Nonlinear Dimensionality Reduction

Characterization and identification of eukaryotic promoter is important for the gene prediction and genome annotation. In this paper, we study the structural characteristics of the core promoters in several eukaryotes through a series of DNA physicochemical properties and adopt a method that combines the alignment and average of multiple promoters and the nonlinear dimensionality reduction tech...

Designing and Development of a DNA Vaccine Based On Structural Proteins of Hepatitis C Virus

Background: Hepatitis C virus (HCV) infection is one of the most prevalent infectious diseases responsible for high morbidity and mortality worldwide. Therefore, designing new and effective therapeutics is of great importance. The aim of the current study was to construct a DNA vaccine containing structural proteins of HCV and evaluation of its expression in a eukaryot...

Common DNA structural features exhibited by eukaryotic ribosomal gene promoters.

Nucleotide sequences of DNA regions containing eukaryotic ribosomal promoters were analysed using strategies designed to reveal sequence-directed structural features. DNA curvature, duplex stability and pattern of twist angle variation were studied by computer modelling. Although ribosomal promoters are known to lack sequence homology (unless very closely related species are considered), invest...

ProSOM: core promoter prediction based on unsupervised clustering of DNA physical profiles

MOTIVATION More and more genomes are being sequenced, and to keep up with the pace of sequencing projects, automated annotation techniques are required. One of the most challenging problems in genome annotation is the identification of the core promoter. Because the identification of the transcription initiation region is such a challenging problem, it is not yet a common practice to integrate ...

The RNA polymerase II core promoter.

The events leading to transcription of eukaryotic protein-coding genes culminate in the positioning of RNA polymerase II at the correct initiation site. The core promoter, which can extend ~35 bp upstream and/or downstream of this site, plays a central role in regulating initiation. Specific DNA elements within the core promoter bind the factors that nucleate the assembly of a functional preini...

Journal:
  • Genome research

Volume 18, Issue 2

Publication date: 2008